Abstract

We investigate the sensitivity of the production rates (activities) of the regulatory proteins CI (repressor) and Cro at the right operator (Or) of bacteriophage lambda. To assess the uncertainty of the activity caused by experimental error, the DNA binding energies of CI, Cro, and RNA polymerase are perturbed by a computational scattering method: the binding energies are simultaneously chosen at random around the literature values, with a width corresponding to the experimental error. In a grand canonical ensemble, with the randomly drawn protein-DNA binding energies as input, we calculate the corresponding activities of the promoters Prm and Pr. By repeating this procedure we obtain a mean value of the activity that roughly corresponds to the wild-type (unperturbed) activity. The standard deviation emerging from this scheme, a measure of the sensitivity due to experimental error, is significant (typically > 20% relative to wild-type activity), but the promoter activities are still sufficiently separated to make the switch feasible. We also suggest a new compact way of presenting repressor and Cro data.



Dedicated to Joshua Jortner on the occasion of his 70th birthday.

Introduction 



The situation is simple: we know the genes, but we do not know precisely how they are regulated or transcribed. Understanding how genetic networks behave appears to be a major challenge in the "post-genomic" era. One class of small genetic networks that is often suitable for theoretical modeling is the so-called genetic switches. Briefly, a genetic (regulatory) switch is a system consisting of a DNA region (operator) and one or more regulatory proteins that are able to bind to this operator in order to promote or inhibit the transcription of a certain gene of the DNA. Several genetic switching systems have been studied extensively, e.g., the tryptophan repressor and the lac operon in E. coli (prokaryotic systems), and the regulation of the gal genes in yeast (eukaryotic systems).^

In this work we study how sensitive the right operator (Or) of bacteriophage lambda (phage λ) is to variations of the protein-DNA binding energies within the experimental error. This operator is described in general elsewhere, e.g., by Ptashne. In brief, Or regulates two important genes, one on either side: cI and cro, which in turn act as templates for the regulatory proteins CI and Cro, respectively. Upon injection of DNA from phage λ into an E. coli bacterium, Or is crucial in deciding the fate of the bacterium: the switch funnels entry either into the dormant lysogenic state or into the lytic state, which leads to the formation of new λ phages with the help of the machinery of the E. coli cell and, ultimately, to the death of the cell. Partially overlapping the switch are the promoter regions Prm, which initiates cI transcription, and Pr, which initiates cro transcription.

We present a new method for analyzing the sensitivity of the activity at the two promoters of Or. The method takes into account the experimental error in the measurements used to determine the Gibbs free energies (GFEs) of the regulatory proteins and RNA polymerase (RNAP) by applying simultaneous random perturbations to the GFEs. A new way of presenting repressor data, where Cro data are implicitly given, is also discussed.



Modeling the system 

A fundamental assumption in this work is the widely accepted view that protein-DNA binding and unbinding of CI, Cro, and RNAP are in equilibrium, i.e., protein associations with DNA are much faster (fractions of a second) than the time-scales relevant for protein production and thus activity (seconds). In equilibrium, the protein-DNA associations of CI dimers (CI_2), Cro dimers (Cro_2), and RNAP with Or of phage λ occur in 40 experimentally distinguishable states, as presently identified. The associated probability f_s of finding the system in one of the 40 states s is

$$
f_s = \frac{\exp\left(-\Delta G(s)/(RT)\right)\,[\mathrm{CI}_2]^{i_s}\,[\mathrm{Cro}_2]^{j_s}\,[\mathrm{RNAP}]^{k_s}}{\sum_{s'} \exp\left(-\Delta G(s')/(RT)\right)\,[\mathrm{CI}_2]^{i_{s'}}\,[\mathrm{Cro}_2]^{j_{s'}}\,[\mathrm{RNAP}]^{k_{s'}}}, \qquad (1)
$$

where R = 8.31 J/(mol K) is the gas constant, T = 310 K is the absolute temperature (37°C), and ΔG(s) is the GFE difference between state s and the unoccupied state, i.e., the protein-DNA binding energy. All concentrations ([X]) refer to the unbound state in solution. i_s, j_s, and k_s are the numbers of CI dimers, Cro dimers, and RNAP bound to Or in state s, respectively.

^Nomenclature: genes are denoted with italicized letters and their protein products with Roman letters (first letter capitalized).

The different ΔG(s) in Eq. (1) are in general sums of GFEs originating from the individual and cooperative bindings of the proteins at the three binding sites of Or (for details see, e.g., Figure 1 of Shea and Ackers). In this work we apply GFE data for CI from Koblan and Ackers, Cro data from Darling et al., and RNAP data from Ackers et al. These binding energies are summarized in Table 1.
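As an illustration of Eq. (1), the following minimal Python sketch evaluates the state probabilities for a small subset of the states (empty operator, CI_2 at Or1, CI_2 at Or2, and the cooperative Or1-Or2 pair). The function name, the tuple layout, and the free CI dimer concentration of 10 nM are assumptions made for this example only; the GFEs are those quoted in this paper, and a full calculation would include all 40 states.

```python
import math

R_KCAL = 8.31 / 4184.0   # gas constant in kcal/(mol K), converted from 8.31 J/(mol K)
T = 310.0                # absolute temperature in K (37 °C)
RT = R_KCAL * T          # roughly 0.62 kcal/mol


def state_probabilities(states, ci2, cro2, rnap):
    """Evaluate Eq. (1).

    states: list of (dG, i, j, k) with dG in kcal/mol and i, j, k the numbers
            of bound CI dimers, Cro dimers, and RNAP in that state.
    ci2, cro2, rnap: free concentrations in mol/liter.
    """
    weights = [
        math.exp(-dG / RT) * ci2**i * cro2**j * rnap**k
        for dG, i, j, k in states
    ]
    z = sum(weights)  # denominator of Eq. (1)
    return [w / z for w in weights]


# Illustrative subset only (not the full 40-state model).
subset = [
    (0.0, 0, 0, 0),                   # empty operator (reference state)
    (-12.5, 1, 0, 0),                 # CI_2 at Or1
    (-10.5, 1, 0, 0),                 # CI_2 at Or2
    (-12.5 - 10.5 - 2.7, 2, 0, 0),    # CI_2 at Or1 and Or2, cooperative
]

# Assumed free concentrations: 10 nM CI_2, no Cro, 30 nM RNAP.
probs = state_probabilities(subset, ci2=10e-9, cro2=0.0, rnap=30e-9)
for (dG, i, j, k), p in zip(subset, probs):
    print(f"dG = {dG:6.1f} kcal/mol, CI_2 bound = {i}: f_s = {p:.3f}")
```

Within this subset the cooperative pair carries most of the statistical weight, which reflects the large combined binding energy of that state.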

Table 1: Protein-DNA binding energies (GFEs) for CI from Koblan and Ackers, Cro from Darling et al., and RNAP from Ackers et al. All GFEs are given in kcal/mol, and the limits (±) correspond to 67% confidence intervals. ΔG_1 is the GFE associated with the binding between CI and operator site Or1, etc. (see, e.g., Ptashne for an explanation/illustration of the different operator sites). ΔG_12 is the GFE associated with cooperative binding between CI at Or1 and Or2, etc. GFEs with a prime (e.g., ΔG_1') correspond to Cro data and are otherwise analogous to the CI notation. ΔG_RM and ΔG_R are the GFEs associated with binding of RNAP to Prm and Pr, respectively. Experimental data were obtained in vitro in 200 mM KCl, resembling "physiological" conditions. CI and Cro are both assumed to obey a monomer-dimer equilibrium in solution, where the free energies of dimerization are -11.0 kcal/mol and -8.7 kcal/mol, respectively. For lack of Cro data at 37°C, the temperature at which the CI and RNAP data were measured, the Cro data used are those obtained at 20°C.



Protein   GFE        Value (kcal/mol)
CI        ΔG_1       -12.5 ± 0.3
          ΔG_2       -10.5 ± 0.2
          ΔG_3        -9.5 ± 0.2
          ΔG_12       -2.7 ± 0.3
          ΔG_23       -2.9 ± 0.5
Cro       ΔG_1'      -12.0 ± 0.1
          ΔG_2'      -10.8 ± 0.1
          ΔG_3'      -13.4 ± 0.1
          ΔG_12'      -1.0 ± 0.2
          ΔG_23'      -0.6 ± 0.2
          ΔG_123'     -0.9 ± 0.2
RNAP      ΔG_RM      -11.5 ± 0.5
          ΔG_R       -12.5 ± 0.5



In Table 2 we list the corresponding 40 different states of protein-DNA associations. Throughout this work we have for simplicity assumed a constant free RNAP concentration of 30 nM. Note that, for lack of Cro data at 37°C, the temperature at which the CI and RNAP data were taken, the Cro data used in the following were obtained at 20°C. It is assumed that the latter data provide a reasonable estimate for the process at 37°C.

Table 2: Gibbs free energies (GFEs) of the different protein associations to Or of phage λ (in state s) of CI dimers (R), Cro dimers (C), and RNAP. "0": empty site, "↔": cooperative interaction, and "Terms": GFE terms according to Table 1. RNAP bound at Pr is written across the Or2 and Or1 columns. GFEs are measured in kcal/mol relative to the unbound state of zero GFE (reference state; s = 1).

 s   Or3  Or2 Or1  Terms                              GFE
 1     0   0   0   Reference state                    0.0
 2     0   0   R   ΔG_1                             -12.5
 3     0   R   0   ΔG_2                             -10.5
 4     R   0   0   ΔG_3                              -9.5
 5     0   0   C   ΔG_1'                            -12.0
 6     0   C   0   ΔG_2'                            -10.8
 7     C   0   0   ΔG_3'                            -13.4
 8  RNAP   0   0   ΔG_RM                            -11.5
 9     0   RNAP    ΔG_R                             -12.5
10     0   R ↔ R   ΔG_1 + ΔG_2 + ΔG_12              -25.7
11     R   0   R   ΔG_1 + ΔG_3                      -22.0
12     R ↔ R   0   ΔG_2 + ΔG_3 + ΔG_23              -22.9
13     0   C ↔ C   ΔG_1' + ΔG_2' + ΔG_12'           -23.8
14     C   0   C   ΔG_1' + ΔG_3'                    -25.4
15     C ↔ C   0   ΔG_2' + ΔG_3' + ΔG_23'           -24.8
16  RNAP   RNAP    ΔG_RM + ΔG_R                     -24.0
17     0   C   R   ΔG_1 + ΔG_2'                     -23.3
18     0   R   C   ΔG_1' + ΔG_2                     -22.5
19     R   0   C   ΔG_1' + ΔG_3                     -21.5
20     C   0   R   ΔG_1 + ΔG_3'                     -25.9
21     R   C   0   ΔG_2' + ΔG_3                     -20.3
22     C   R   0   ΔG_2 + ΔG_3'                     -23.9
23     R   RNAP    ΔG_R + ΔG_3                      -22.0
24  RNAP   R   0   ΔG_2 + ΔG_RM                     -22.0
25  RNAP   0   R   ΔG_1 + ΔG_RM                     -24.0
26     C   RNAP    ΔG_R + ΔG_3'                     -25.9
27  RNAP   C   0   ΔG_2' + ΔG_RM                    -22.3
28  RNAP   0   C   ΔG_1' + ΔG_RM                    -23.5
29     R   R ↔ R   ΔG_1 + ΔG_2 + ΔG_3 + ΔG_12       -35.2
30     C ↔ C ↔ C   ΔG_1' + ΔG_2' + ΔG_3' + ΔG_123'  -37.1
31     C   R ↔ R   ΔG_1 + ΔG_2 + ΔG_3' + ΔG_12      -39.1
32     R   C   R   ΔG_1 + ΔG_2' + ΔG_3              -32.8
33     R ↔ R   C   ΔG_1' + ΔG_2 + ΔG_3 + ΔG_23      -34.9
34     R   C ↔ C   ΔG_1' + ΔG_2' + ΔG_3 + ΔG_12'    -33.3
35     C   R   C   ΔG_1' + ΔG_2 + ΔG_3'             -35.9
36     C ↔ C   R   ΔG_1 + ΔG_2' + ΔG_3' + ΔG_23'    -37.3
37  RNAP   R ↔ R   ΔG_1 + ΔG_2 + ΔG_RM + ΔG_12      -37.2
38  RNAP   C ↔ C   ΔG_1' + ΔG_2' + ΔG_RM + ΔG_12'   -35.3
39  RNAP   C   R   ΔG_1 + ΔG_2' + ΔG_RM             -34.8
40  RNAP   R   C   ΔG_1' + ΔG_2 + ΔG_RM             -34.0
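The GFE column of Table 2 follows from the 13 entries of Table 1, and the bookkeeping can be checked with a short Python sketch. The single-letter site codes ('0', 'R', 'C', and 'P' for RNAP), the function name, and the dictionary layout are assumptions made for this illustration; the cooperativity rule is the one read off from the "Terms" column of Table 2, not an independent statement about the underlying model.

```python
from itertools import product

# Binding GFEs in kcal/mol from Table 1; keys are (protein, site number).
G = {
    ("R", 1): -12.5, ("R", 2): -10.5, ("R", 3): -9.5,   # CI dimers
    ("C", 1): -12.0, ("C", 2): -10.8, ("C", 3): -13.4,  # Cro dimers
}
# Cooperative terms from Table 1; keys are (protein, pair or triple).
COOP = {
    ("R", "12"): -2.7, ("R", "23"): -2.9,
    ("C", "12"): -1.0, ("C", "23"): -0.6, ("C", "123"): -0.9,
}
G_RM, G_R = -11.5, -12.5  # RNAP at Prm and at Pr


def state_gfe(or3, or2, or1):
    """GFE of one configuration; each site holds '0', 'R', 'C', or 'P'."""
    if or1 == "P":                       # RNAP at Pr covers Or2 and Or1
        total = G_R
    else:
        total = sum(G[(p, n)] for p, n in ((or1, 1), (or2, 2)) if p != "0")
    if or3 == "P":                       # RNAP at Prm sits at Or3
        total += G_RM
    elif or3 != "0":
        total += G[(or3, 3)]
    # Cooperativity between like proteins on adjacent sites.
    pair12 = or1 == or2 and or1 in ("R", "C")
    pair23 = or2 == or3 and or2 in ("R", "C")
    if pair12 and pair23:
        # Three like proteins: Table 2 uses dG_12 for CI and dG_123' for Cro.
        total += COOP[("C", "123")] if or1 == "C" else COOP[("R", "12")]
    elif pair12:
        total += COOP[(or1, "12")]
    elif pair23:
        total += COOP[(or2, "23")]
    return round(total, 1)


# Or2/Or1 region: nine protein combinations plus RNAP at Pr.
right_region = list(product("0RC", repeat=2)) + [("P", "P")]
states = {
    (or3, or2, or1): state_gfe(or3, or2, or1)
    for or3 in "0RCP"
    for (or2, or1) in right_region
}

print(len(states), "states")
for key in [("0", "R", "R"), ("R", "R", "R"), ("C", "C", "C"), ("P", "P", "P")]:
    print(key, states[key])
```

The enumeration gives 4 x 10 = 40 configurations, in agreement with the number of states listed in Table 2.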

The main purpose of this paper is to study the sensitivity of the production rates (activities) with respect to the experimental error of the GFEs. To this end, we assume that transcription initiation (with its isomerization rate) is the rate-limiting step in protein synthesis [15,16]. Accordingly, activity is defined as the product of the isomerization rate and the probability of RNAP occupancy of the promoter. The latter probability is a sum of the f_s in Eq. (1). In what follows, we use the same rate constants as Shea and Ackers in evaluating these activities.
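Written out, this definition of the activity reads as follows. The symbols A_P, k_P, and S_P are introduced here for illustration only and do not appear elsewhere in the text; the state sets are read off from the RNAP entries of Table 2. If the isomerization rate differs between states, k_P moves inside the sum as a state-dependent factor.

```latex
A_P = k_P \sum_{s \in S_P} f_s , \qquad P \in \{\mathrm{Prm}, \mathrm{Pr}\},
\\
S_{\mathrm{Prm}} = \{8, 16, 24, 25, 27, 28, 37, 38, 39, 40\}, \qquad
S_{\mathrm{Pr}} = \{9, 16, 23, 26\}.
```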

Results and discussion 

In a previous study we analyzed the sensitivity of Or through a systematic one-by-one perturbation scheme of the GFEs, using a data set without monomer-dimer equilibrium for Cro. Each individual GFE (corresponding to those in Table 1) was perturbed by ±1 kcal/mol, one at a time, whereupon the change in activity compared to the wild-type (unperturbed) activity was calculated. In that work, Bakk et al. showed that for a lysogen the sensitivity of the activity is low (upon CI and Cro perturbations), while the sensitivity increases for protein concentrations around induction, where the λ switch turns over from the lysogenic to the lytic pathway.

The distinct novel feature of this work is that we apply a computational scattering method, in which the different GFEs are chosen randomly in the parameter hyperspace and applied in the model simultaneously. That is, each GFE (see Table 1) is drawn from a Gaussian distribution whose mean is the literature value and whose standard deviation equals the experimental uncertainty (indicated by ± in Table 1).† In this way 13 new GFE values are obtained, and the activities at both promoters are evaluated. This procedure is performed 10^3 times, which we checked to be sufficient to ensure reliable statistics, whereupon the mean value (mean) and standard deviation (SD) are calculated from this set. The SD reflects the typical uncertainty of the activities due to the experimental error of the GFEs. Here we define the sensitivity of the activity as the ratio between the standard deviation resulting from the computational scattering and the wild-type (unperturbed) activity.
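The scattering procedure can be sketched in Python as follows. This is a simplified illustration and not the code used for the results reported here: it treats only the Cro-free case, so that 15 of the 40 states and 7 of the 13 GFEs enter; it takes the free CI dimer concentration as a given input (80 nM is an assumed value); and it reports the RNAP occupancy of Prm rather than the activity, since a state-independent isomerization rate would cancel in the ratio SD/wild-type. All function and variable names are hypothetical.

```python
import math
import random
import statistics

RT = 8.31 / 4184.0 * 310.0  # kcal/mol at T = 310 K

# (literature value, experimental uncertainty) in kcal/mol, CI and RNAP only.
GFE = {
    "G1": (-12.5, 0.3), "G2": (-10.5, 0.2), "G3": (-9.5, 0.2),
    "G12": (-2.7, 0.3), "G23": (-2.9, 0.5),
    "GRM": (-11.5, 0.5), "GR": (-12.5, 0.5),
}


def prm_occupancy(g, ci2, rnap):
    """Eq. (1) summed over the Cro-free states with RNAP bound at Prm."""
    # Or2/Or1 region: (GFE, number of CI dimers, number of RNAP)
    right = [
        (0.0, 0, 0),                            # both sites empty
        (g["G1"], 1, 0),                        # CI at Or1
        (g["G2"], 1, 0),                        # CI at Or2
        (g["G1"] + g["G2"] + g["G12"], 2, 0),   # cooperative CI pair
        (g["GR"], 0, 1),                        # RNAP at Pr
    ]
    z = z_prm = 0.0
    for idx, (dg_right, n_ci, n_rnap) in enumerate(right):
        for or3 in ("empty", "CI", "RNAP"):
            dg, i, k = dg_right, n_ci, n_rnap
            if or3 == "CI":
                dg, i = dg + g["G3"], i + 1
                if idx == 2:                    # CI at Or2 only: Or2-Or3 pair
                    dg += g["G23"]
            elif or3 == "RNAP":
                dg, k = dg + g["GRM"], k + 1
            weight = math.exp(-dg / RT) * ci2**i * rnap**k
            z += weight
            if or3 == "RNAP":
                z_prm += weight
    return z_prm / z


def scatter(n=1000, ci2=80e-9, rnap=30e-9, seed=1):
    """Draw all GFEs simultaneously n times; return wild type, mean, SD, SD/wild."""
    rng = random.Random(seed)
    wild = prm_occupancy({k: m for k, (m, _) in GFE.items()}, ci2, rnap)
    samples = [
        prm_occupancy({k: rng.gauss(m, sd) for k, (m, sd) in GFE.items()},
                      ci2, rnap)
        for _ in range(n)
    ]
    mean, sd = statistics.mean(samples), statistics.stdev(samples)
    return wild, mean, sd, sd / wild


wild, mean, sd, sensitivity = scatter()
print(f"wild type {wild:.3f}, mean {mean:.3f}, SD {sd:.3f}, "
      f"sensitivity {sensitivity:.2f}")
```

Every realization draws all GFEs at once, which is what distinguishes this scheme from a one-by-one perturbation.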

Figure 1a shows how the parameter ΔG_1 is scattered around the mean value -12.5 kcal/mol, with an SD of 0.3 kcal/mol as given in Table 1, for 1000 realizations (random draws). Figure 1b gives a corresponding example of how the activity is spread due to variations of all GFEs in the same run. Note in this particular example that some of the scattered data points are shifted toward very low values, leading to a mean value of the scattered activities (0.0077 s^-1) that is lower than the wild-type activity (0.0081 s^-1). However, as also discussed below, the skewness, here and in the other simulations, is not very pronounced. The obtained mean values are very close to the wild-type values. This is not a priori obvious, because these values originate from random draws from a Gaussian distribution of the GFEs, which in turn enter the exponents in the grand canonical partition function (Eq. (1)), and this might produce a skewness in the distribution of the activities around the mean. A general feature is that the SD relative to the wild-type activity, i.e., the sensitivity, is large, and that the sensitivity is largest for a combination of moderate/large repressor concentrations and low activity. On the other hand, it is known from experiments that the robustness upon perturbations, in particular of the lysogenic state, is high. Thus, in light of these studies, and despite the large uncertainty of the activities due to the experimental error found here, a lysogen remains stable under such perturbations.

†A 67% confidence interval corresponds to a Gaussian distribution around the mean value with standard deviation equal to the experimental uncertainty.

Figure 1: a) Scattering of the protein-DNA binding energy ΔG_1, randomly drawn from a Gaussian distribution with a mean of -12.5 kcal/mol (horizontal line) and an SD of 0.3 kcal/mol (see Table 1). The latter value corresponds to the 67% confidence interval in the experiments. b) Corresponding scattering of the activity, due to variations of all GFEs in the same run, at promoter Prm for [CI_t] = 200 nM and zero Cro concentration (typical for a lysogen). The continuous horizontal line corresponds to the wild-type activity (0.0081 s^-1), and the scattered horizontal line corresponds to the mean activity of the 1000 scattered values in this plot (0.0077 s^-1). "Event" refers to the number in the series of the randomly drawn binding energies out of 1000 realizations.

Figure 2: Promoter activity versus total CI concentration for [Cro_t] ≈ 0. Prm corresponds to cI activity and Pr corresponds to cro activity. Fully drawn curves ("wild-type") correspond to the experimental GFE data listed in Table 1, where CI data are from Koblan and Ackers, Cro data from Darling et al., and RNAP data from Ackers et al. Promoter activity corresponds to the number of RNAP-DNA complexes that become transcriptionally active per second. "Scattering" (×) denotes mean values of the activities obtained from the computational scattering (described in the main text), shown with their standard deviations (only indicated for deviations > 0.3 × 10^-3 s^-1). The thin vertical line indicates the lysogenic concentration (≈ 200 nM). The abscissa is drawn on a logarithmic (decadic) scale.

In order to study the sensitivity of the activity around induction, i.e., at concentrations where CI production is replaced by Cro production, we perform a scattering computation analogous to that in Figure 2, but this time with a total Cro concentration ([Cro_t]) of 50 nM. This value may represent a typical Cro concentration around induction. Compared with [Cro_t] ≈ 0, the sensitivity of the activity is higher in this concentration regime (see Figure 3). Accordingly, the activities of both Prm and Pr are also reduced, which is reasonable because an increased Cro concentration implies increased Cro occupancy at both promoters, so that transcription occurs less frequently. We also test the case [Cro_t] ≈ 200 nM (a typical lytic concentration), which leads to smaller activities than in the two previous cases. Because the activities are small, the relative sensitivity is high in this case.

Figure 3: Promoter activity versus total CI concentration for [Cro_t] ≈ 50 nM. See also the caption of Figure 2.

Figures 2 and 3 present the sensitivity of the activity, for a given Cro concentration, versus CI concentration. However, this can be done in a more compact way, as shown in the following. The rate of Cro production may be written as (used to produce Figure 4a)

$$
\frac{d[\mathrm{Cro_t}]}{dt} = 10^{-9}\, S\, R\, P_R - \frac{[\mathrm{Cro_t}]}{\tau_{\mathrm{dil}}} - \frac{[\mathrm{Cro_t}]}{\tau_{\mathrm{deg}}}, \qquad (2)
$$

where [Cro_t] is the total Cro concentration in nM. S ≈ 20 is the average number of Cro molecules made from each transcript, and R ≈ 2.5 × 10^-2 s^-1 is the rate of transcript initiation, both estimated by Aurell et al. P_R is the probability of RNAP occupancy of promoter Pr calculated from Eq. (1). τ_dil ≈ 34 min is the lifetime of a cell generation, and τ_deg ≈ 2600 s is the in vivo half-life of Cro due to degradation. The prefactor 10^-9 is simply a conversion factor for going from numbers (of proteins) to concentrations, assuming an average cellular volume of 2 × 10^-15 liters.
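The size of this prefactor can be checked by a short calculation, assuming that the conversion is one molecule per cell volume expressed in mol/liter (N_A is Avogadro's number and V the cellular volume; both symbols are introduced here for illustration):

```latex
\frac{1}{N_A V}
  = \frac{1}{(6.022 \times 10^{23}\ \mathrm{mol^{-1}})\,(2 \times 10^{-15}\ \mathrm{l})}
  \approx 8.3 \times 10^{-10}\ \mathrm{M}
  \approx 10^{-9}\ \mathrm{M}.
```

That is, under this assumption one molecule per cell corresponds to roughly 1 nM.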

We now assume Cro production to be in equilibrium, i.e., d[Cro_t]/dt = 0 in Eq. (2), which is a reasonable assumption because Cro production occurs on a time scale of seconds, while, for instance, a cell generation is of the order of half an hour. Thus, for a given repressor concentration we are able to estimate the Cro concentration (see Figure 4a). One should note that the parameters in Eq. (2) are associated with large uncertainty (≈ 20%); nevertheless, this method is a valuable supplement to the presentation in Figures 2 and 3.
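The self-consistency condition can be illustrated with a minimal Python sketch that solves d[Cro_t]/dt = 0 by bisection. The occupancy function used here is a purely hypothetical placeholder that merely decreases with the Cro concentration; in the actual calculation the occupancy of Pr follows from Eq. (1) at the given CI concentration. The sketch also assumes that the production term with the prefactor 10^-9 is expressed in mol/liter per second, and all names are chosen for this example only.

```python
S = 20.0              # Cro molecules per transcript (value quoted in the text)
RATE = 2.5e-2         # transcript initiation rate in 1/s (value quoted in the text)
TAU_DIL = 34 * 60.0   # cell generation time in s (about 34 min)
TAU_DEG = 2600.0      # Cro degradation time in s
PREFACTOR = 1e-9      # numbers-to-concentration conversion; mol/liter assumed here


def pr_occupancy_placeholder(cro):
    """HYPOTHETICAL stand-in for the Eq. (1) occupancy of Pr.

    It only mimics self-repression: the occupancy falls as [Cro_t] (in M) grows.
    """
    return 0.8 / (1.0 + (cro / 1e-7) ** 2)


def net_rate(cro, occupancy=pr_occupancy_placeholder):
    """Right-hand side of Eq. (2) for a total Cro concentration cro (in M)."""
    production = PREFACTOR * S * RATE * occupancy(cro)
    return production - cro / TAU_DIL - cro / TAU_DEG


def steady_state(lo=0.0, hi=1e-5, tol=1e-15):
    """Bisection for net_rate = 0; requires net_rate(lo) > 0 > net_rate(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_rate(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


cro_ss = steady_state()
print(f"steady-state [Cro_t] = {cro_ss * 1e9:.0f} nM (placeholder occupancy)")
```

Because production falls and loss rises with increasing Cro concentration, the net rate is monotonic and the bisection finds a unique root. Repeating the solve for each CI concentration would give a curve of the kind shown in Figure 4a.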

Above we investigated the sensitivity of the activity of the promoters by assuming a fixed Cro concentration (Figures 2 and 3). In Figure 4b we show the sensitivity of the activity of the promoters obtained with the self-consistent method that corresponds to Figure 4a. The activity at Pr is reduced for [CI_t] < 10 nM compared with the situation in Figures 2 and 3. This makes sense because, according to Figure 4a, [Cro_t] ≈ 150 nM for [CI_t] < 10 nM, resulting in self-repression of Cro. Prm is also repressed by Cro in this concentration regime, leading to zero activity. We find that the sensitivity of the activity is at the same level as in the previous analysis, with a standard deviation of the activity relative to the wild-type activity of > 20%.

Finally, we implement the computational scattering method with a flat distribution in an interval of ±1.5 × SD, where SD is the standard deviation in Table 1, which corresponds to 67% confidence intervals. E.g., ΔG_2 is drawn at random in the interval from -10.8 kcal/mol to -10.2 kcal/mol. This results in a mean value of the activity similar to the wild-type value and a sensitivity of the same order as obtained with the Gaussian scattering presented above. Thus, the random scattering method seems to be rather insensitive to the functional form of the distribution from which the random draws are made.
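A short calculation, which is not part of the original analysis, makes this insensitivity plausible: a flat distribution of half-width a = 1.5 SD has a standard deviation close to that of the Gaussian it replaces (the symbols a and σ_flat are introduced here for illustration).

```latex
\sigma_{\mathrm{flat}} = \frac{a}{\sqrt{3}} = \frac{1.5\,\mathrm{SD}}{\sqrt{3}} \approx 0.87\,\mathrm{SD}.
```

For ΔG_2, with SD = 0.2 kcal/mol, this gives a = 0.3 kcal/mol and σ_flat ≈ 0.17 kcal/mol.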

Figure 4: a) Total Cro concentration vs. total repressor concentration (logarithmic scale), where the Cro concentration is determined self-consistently via the equilibrium ansatz d[Cro_t]/dt = 0 in Eq. (2). b) Promoter activity versus total CI concentration, where the Cro concentration is determined self-consistently. Note that the rise of the Prm curve (around 50 nM) is much sharper than in Figures 2 and 3, indicating a larger cooperativity when the Cro concentration is determined in the self-consistent way (feedback).

Summary and conclusion



The main purpose of this work was to study the sensitivity of the production rates (activities) of the regulatory proteins CI and Cro associated with Or (a genetic switch) in phage λ. The binding of these regulatory proteins and RNA polymerase to DNA is assumed to be in equilibrium. Thus, by applying a grand canonical approach (a statistically open system, as presented in Eq. (1)) [9,6], we are able to find the probability of binding to Or, whereupon we calculate the rates of CI and Cro production (activities). We perform a computational scattering in which each of the 13 different protein-DNA binding energies is randomly drawn from a Gaussian distribution with mean equal to the wild-type GFE and standard deviation corresponding to the experimental error. Then, the corresponding activities associated with promoters Prm and Pr are calculated. This is performed 10^3 times, whereupon the mean and standard deviation of the resulting activities are evaluated.

The mean value emerging from this computational scattering scheme is in general close to the wild-type activity, where the latter is calculated from the experimentally given (wild-type) values. The relative sensitivity of the activity, defined as the ratio between the standard deviation resulting from the "scattering" and the wild-type (unperturbed) activity, is in most cases > 20%. The sensitivity of the Prm activity for a lysogen, where the CI concentration typically is around 200 nM while the Cro concentration is zero, is around 20%. Thus, according to Bailone et al., perturbations of the size applied in this work (0.1-0.5 kcal/mol) are not enough to destabilize a lysogen. The Pr activity for a lysogen is highly sensitive; however, one should note that the wild-type activity of Pr is negligible here. Around induction, where the CI and Cro concentrations are at comparable levels (25-50 nM), the sensitivity of the activity is high (> 50%). The same holds in the lytic regime, where Cro dominates. Despite the relatively large error, the activities of the two promoters seem to be separated within the error (see Figures 2, 3, and 4), making the switch feasible.

We note that the perturbations performed here (and the conclusions drawn) may to some extent account for cell-to-cell variations of the protein concentrations, i.e., noise, which may effectively be viewed as variations in the binding energies. However, in order to study noise systematically one should, in the same manner as we scattered the GFEs randomly, choose the protein concentrations at random.

We also make an equilibrium ansatz for Cro production, by which we are able to calculate, for a given Cro concentration, the corresponding repressor concentration. This method leads to a more "compact" presentation of data, because only the CI concentration is then a real variable, the Cro concentration being implicitly given, or vice versa. The sensitivity of the activity of the two promoters obtained with this method is of the same size as that obtained earlier in this work with fixed Cro concentrations.